Please use a computer and network that will allow you to use Google products (Gmail, Google Drive, etc.). Some workplaces do not allow this on their devices!
If you don’t already have a personal (not work) Google Cloud Platform account, you will set one up in the first section of today’s course. You will need:
If you’re an established GCP user who has already used up this month’s “free tier” allowance, you may accrue a small charge for today’s work, though that’s unlikely.
“BigQuery is a fully managed enterprise data warehouse that helps you manage and analyze your data with built-in features like machine learning, geospatial analysis, and business intelligence. BigQuery’s serverless architecture lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management.”
This slide deck was built in Quarto!
Joy Payton (she/her)
Data Scientist / Data Educator
https://www.linkedin.com/in/joypayton/
I have no conflicts of interest to report.
Please grab these links and use them!
This slide show: (once you open the slide show, you can press ‘s’ to open speaker notes)
GitHub repository:
Hour 1: GCP and BigQuery Orientation
Hour 2: BigQuery Data, SQL in BigQuery, R Tools
Hour 3: R/RStudio and BigQuery Integration
GCP, or Google Cloud Platform, is a public cloud provider.
It’s similar to other offerings you may have heard of, like AWS or Azure.
Cloud providers are increasingly important in medicine!
If you already have a GCP account that you’ll be using for this workshop, you have two options:
Option A: go get a cup of coffee and we’ll see you back here in about 15 minutes. You might end up spending money on our activities today!
Option B (recommended): stick around to create a new account with a brand new “welcome to GCP” free trial worth $300 in GCP services.
If you need to create a new Google identity, please go to https://accounts.google.com now and create a new account. Even if you already have one, creating a new account is a way to make sure you’ll be working in the free tier with some Google credits!
Have your Google identity? Now you can go to https://console.cloud.google.com to sign up for GCP.
You have to do two things:
First, agree to terms (the easy part).
Check the box and click “Agree…”
Now, start the free trial: a bit more complex.
Let’s experiment. Click on the Gemini “star” and ask a question about BigQuery or GCP. I thought it might be interesting to ask about BigQuery and medicine.
Google Cloud Platform (GCP) organizes resources by project.
Optionally, you can also define an organization and group projects by folder.
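As a rough mental model of that hierarchy, here is a minimal Python sketch. It is illustrative only and does not call any GCP API. All names in it (`example-org`, `training`, `my-workshop-project`, `workshop_data`) are hypothetical. It also simplifies things: real folders can be nested, and a project can exist without any organization or folder.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """A GCP project: the container that resources belong to."""
    project_id: str
    resources: list[str] = field(default_factory=list)


@dataclass
class Folder:
    """An optional grouping of projects inside an organization."""
    name: str
    projects: list[Project] = field(default_factory=list)


@dataclass
class Organization:
    """An optional top-level node that holds folders."""
    name: str
    folders: list[Folder] = field(default_factory=list)


def resource_paths(org: Organization) -> list[str]:
    """List every resource as organization/folder/project/resource."""
    return [
        f"{org.name}/{folder.name}/{project.project_id}/{resource}"
        for folder in org.folders
        for project in folder.projects
        for resource in project.resources
    ]


# Hypothetical names, for illustration only.
workshop = Project("my-workshop-project", ["bigquery-dataset:workshop_data"])
org = Organization("example-org", [Folder("training", [workshop])])
print(resource_paths(org))
```

For today’s work, the project level is the one that matters: everything you create in BigQuery will live inside your project.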
To get started with BigQuery, you will:
Optionally, add other resources:
Once you’re in your project, click the “burger” menu (☰).
BigQuery is probably already pinned; if so, click on it. If you don’t see BigQuery:
Choose “View all Products”
In “Analytics”, click on BigQuery (you may also want to “pin” this to the top of your menu).
Enable the BigQuery API to add BigQuery to your project.
Look for Area Deprivation Index (ADI) by searching for it or looking in the healthcare category.
In the “View Dataset” screen, the specific dataset will be highlighted in the left panel, which shows your current data. Please click on the star icon next to the dataset so that you can find it again later.
Look for the CDC Births data by searching for it or looking in the healthcare category. View it and “star” it!
Since there are so many public datasets, Google no longer displays those by default in BigQuery, even though you have access to them. That’s why we “starred” them.
Now we can choose to show “only starred data”.
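Starring only changes what the left panel displays. A public table can be queried by its full `project.dataset.table` name whether or not it is starred. The sketch below composes such a preview query as text, which you could paste into the BigQuery query editor. Google's public datasets live in the `bigquery-public-data` project. The dataset and table names used here for the ADI data are assumptions, so please confirm them in the left panel before running anything. The helper functions `table_ref` and `preview_query` are hypothetical names made up for this example.

```python
def table_ref(project: str, dataset: str, table: str) -> str:
    """Build a backtick-quoted BigQuery table reference: `project.dataset.table`."""
    return f"`{project}.{dataset}.{table}`"


def preview_query(project: str, dataset: str, table: str, limit: int = 10) -> str:
    """Compose a SQL statement that previews the first few rows of a table."""
    return f"SELECT *\nFROM {table_ref(project, dataset, table)}\nLIMIT {limit}"


# Public datasets live in the bigquery-public-data project.
# The dataset and table names below are assumptions; confirm them in the left panel.
sql = preview_query(
    "bigquery-public-data",
    "broadstreet_adi",
    "area_deprivation_index_by_county",
)
print(sql)
```

This only builds the query string. Nothing is sent to BigQuery, so no credentials are needed and no charges are incurred.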
Let’s take a break. So far you:
During break, you can either relax, or, if you want:
Joy Payton, Children’s Hospital of Philadelphia